Overview

Dataset Statistics

Number of Variables 12
Number of Rows 891
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 83.7 KB
Average Row Size in Memory 96.1 B
Variable Types
  • Numerical: 6
  • Categorical: 6

Dataset Insights

PassengerId is uniformly distributed (Uniform)
Name is uniformly distributed (Uniform)
PassengerId and Name have similar distributions (Similar Distribution)
Age is skewed (Skewed)
Fare is skewed (Skewed)
Cabin is skewed (Skewed)
Survived has constant length 1 (Constant Length)
Pclass has constant length 1 (Constant Length)
Sex has constant length 1 (Constant Length)
SibSp has constant length 1 (Constant Length)
Parch has constant length 1 (Constant Length)
Embarked has constant length 1 (Constant Length)

Variables

PassengerId

numerical

Approximate Distinct Count 891
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 446
Minimum 1
Maximum 891
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • PassengerId is uniformly distributed

Quantile Statistics

Minimum 1
5-th Percentile 45.5
Q1 223.5
Median 446
Q3 668.5
95-th Percentile 846.5
Maximum 891
Range 890
IQR 445

Descriptive Statistics

Mean 446
Standard Deviation 257.3538
Variance 66231
Sum 397386
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.577
  • PassengerId is not normally distributed (p-value ≈ 7.26e-05)
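
The PassengerId figures above are what one would expect if the column is simply the integers 1 through 891. That is an assumption, suggested by the minimum, maximum, and distinct count rather than stated in the report. A minimal Python sketch that reproduces the reported mean, sum, sample variance, standard deviation, coefficient of variation, and quartiles under that assumption follows. The variable names are illustrative, and the quartiles use linear interpolation (`method="inclusive"`), which is assumed to match what the report used:

```python
import statistics

# Assumption: PassengerId is exactly the integers 1..891.
ids = list(range(1, 892))

mean = statistics.fmean(ids)
total = sum(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std = statistics.stdev(ids)          # square root of the sample variance
cv = std / mean                      # coefficient of variation

# Quartiles with linear interpolation between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(ids, n=4, method="inclusive")
iqr = q3 - q1

print(mean, total, variance, round(std, 4), round(cv, 3))
print(q1, median, q3, iqr)
```

Under these assumptions the printed values can be compared line by line with the Quantile Statistics and Descriptive Statistics listed for PassengerId.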

Survived

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 1
3rd row 1
4th row 1
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent value (0) occurs over 1.61 times as often as the second most frequent value (1)
  • Survived has words of constant length

Pclass

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 3
2nd row 1
3rd row 3
4th row 1
5th row 3

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (3, 1) account for over 50.0% of rows
  • The most frequent value (3) occurs over 2.27 times as often as the second most frequent value (1)
  • Pclass has words of constant length

Name

numerical

Approximate Distinct Count 891
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 445
Minimum 0
Maximum 890
Zeros 1
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Name is uniformly distributed

Quantile Statistics

Minimum 0
5-th Percentile 44.5
Q1 222.5
Median 445
Q3 667.5
95-th Percentile 845.5
Maximum 890
Range 890
IQR 445

Descriptive Statistics

Mean 445
Standard Deviation 257.3538
Variance 66231
Sum 396495
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.5783
  • Name is not normally distributed (p-value ≈ 7.26e-05)
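
The Name statistics (minimum 0, maximum 890, 891 distinct values) are consistent with the names having been replaced by integer codes 0 to 890 before profiling. This is an inference from the numbers, not something the report states. Under that assumption, the reported skewness of 0 and kurtosis of -1.2 are the values expected for an evenly spaced sequence, as this sketch based on population moments suggests. Whether the report applies a bias correction is not known, and the helper `central_moment` is a hypothetical name:

```python
# Assumption: Name was encoded as the integer codes 0..890.
codes = list(range(891))
n = len(codes)
mean = sum(codes) / n


def central_moment(k):
    """Return the k-th population central moment of the codes."""
    return sum((x - mean) ** k for x in codes) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
skewness = m3 / m2 ** 1.5           # zero for a symmetric sequence
excess_kurtosis = m4 / m2 ** 2 - 3  # kurtosis relative to a normal distribution

print(mean, sum(codes))
print(round(skewness, 4), round(excess_kurtosis, 4))
```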

Sex

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 0
3rd row 0
4th row 0
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (1, 0) account for over 50.0% of rows
  • The most frequent value (1) occurs over 1.84 times as often as the second most frequent value (0)
  • Sex has words of constant length

Age

numerical

Approximate Distinct Count 88
Approximate Unique (%) 9.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 29.3616
Minimum 0.42
Maximum 80
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Age is skewed right (γ1 = 0.5094)

Quantile Statistics

Minimum 0.42
5-th Percentile 6
Q1 22
Median 28
Q3 35
95-th Percentile 54
Maximum 80
Range 79.58
IQR 13

Descriptive Statistics

Mean 29.3616
Standard Deviation 13.0197
Variance 169.5125
Sum 26161.17
Skewness 0.5094
Kurtosis 0.9816
Coefficient of Variation 0.4434
  • Age is not normally distributed (p-value ≈ 3.77e-21)
  • Age has 66 outliers

SibSp

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.8%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 0
4th row 1
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent value (0) occurs over 2.91 times as often as the second most frequent value (1)
  • SibSp has words of constant length

Parch

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.8%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent value (0) occurs over 5.75 times as often as the second most frequent value (1)
  • Parch has words of constant length

Ticket

numerical

Approximate Distinct Count 681
Approximate Unique (%) 76.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 338.5286
Minimum 0
Maximum 680
Zeros 3
Zeros (%) 0.3%
Negatives 0
Negatives (%) 0.0%
  • Ticket is nominally skewed right (γ1 = 0.0002), which is effectively symmetric

Quantile Statistics

Minimum 0
5-th Percentile 33.5
Q1 158.5
Median 337
Q3 519.5
95-th Percentile 641.5
Maximum 680
Range 680
IQR 361

Descriptive Statistics

Mean 338.5286
Standard Deviation 200.8507
Variance 40340.9865
Sum 301629
Skewness 0.00024536
Kurtosis -1.2778
Coefficient of Variation 0.5933
  • Ticket is not normally distributed (p-value ≈ 6.64e-04)

Fare

numerical

Approximate Distinct Count 248
Approximate Unique (%) 27.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 32.2042
Minimum 0
Maximum 512.3292
Zeros 15
Zeros (%) 1.7%
Negatives 0
Negatives (%) 0.0%
  • Fare is skewed right (γ1 = 4.7793)

Quantile Statistics

Minimum 0
5-th Percentile 7.225
Q1 7.9104
Median 14.4542
Q3 31
95-th Percentile 112.0791
Maximum 512.3292
Range 512.3292
IQR 23.0896

Descriptive Statistics

Mean 32.2042
Standard Deviation 49.6934
Variance 2469.4368
Sum 28693.9493
Skewness 4.7793
Kurtosis 33.2043
Coefficient of Variation 1.5431
  • Fare is not normally distributed (p-value ≈ 5.93e-18)
  • Fare has 116 outliers
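
Several of the Fare figures are derived from others, so they can be cross-checked directly from the reported values. A small sketch follows; the dictionary keys are illustrative names, and the tolerances only allow for the four-decimal rounding used in the report:

```python
import math

# Values copied from the Fare section of the report.
fare = {
    "n": 891, "sum": 28693.9493, "mean": 32.2042,
    "min": 0.0, "max": 512.3292, "q1": 7.9104, "q3": 31.0,
    "range": 512.3292, "iqr": 23.0896,
    "std": 49.6934, "variance": 2469.4368, "cv": 1.5431,
}

# Each check recomputes one derived statistic from the others.
checks = {
    "mean = sum / n": math.isclose(
        fare["sum"] / fare["n"], fare["mean"], abs_tol=1e-4),
    "range = max - min": math.isclose(
        fare["max"] - fare["min"], fare["range"], abs_tol=1e-4),
    "iqr = q3 - q1": math.isclose(
        fare["q3"] - fare["q1"], fare["iqr"], abs_tol=1e-4),
    "variance = std ** 2": math.isclose(
        fare["std"] ** 2, fare["variance"], abs_tol=1e-2),
    "cv = std / mean": math.isclose(
        fare["std"] / fare["mean"], fare["cv"], abs_tol=1e-4),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

The same checks apply to the other numerical variables by substituting their reported values.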

Cabin

numerical

Approximate Distinct Count 147
Approximate Unique (%) 16.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 13.9 KB
Mean 53.6397
Minimum 0
Maximum 146
Zeros 1
Zeros (%) 0.1%
Negatives 0
Negatives (%) 0.0%
  • Cabin is skewed right (γ1 = 2.2651)

Quantile Statistics

Minimum 0
5-th Percentile 35.5
Q1 47
Median 47
Q3 47
95-th Percentile 117.5
Maximum 146
Range 146
IQR 0

Descriptive Statistics

Mean 53.6397
Standard Deviation 23.5683
Variance 555.4644
Sum 47793
Skewness 2.2651
Kurtosis 5.4458
Coefficient of Variation 0.4394
  • Cabin is not normally distributed (p-value ≈ 4.26e-25)
  • Cabin has 200 outliers

Embarked

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 57.4 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 2
2nd row 0
3rd row 2
4th row 2
5th row 2

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 891
  • The top 2 categories (2, 0) account for over 50.0% of rows
  • The most frequent value (2) occurs over 3.85 times as often as the second most frequent value (0)
  • Embarked has words of constant length

Interactions

Correlations

Missing Values